| Name | Version | Summary | Date |
|---|---|---|---|
| torch-tensorrt | 2.8.0 | Torch-TensorRT is a package that allows users to automatically compile PyTorch and TorchScript modules to TensorRT while remaining in PyTorch (see the usage sketch below the table). | 2025-08-09 06:02:00 |
| b10-xgrammar | 0.1.22 | Efficient, flexible, and portable structured generation. | 2025-08-08 23:28:16 |
| speculators | 0.1.0 | A unified library for creating, representing, and storing speculative decoding algorithms for LLM serving, such as in vLLM. | 2025-08-08 01:22:17 |
| emdcmp | 1.1.1 | Original implementation of the EMD (empirical model discrepancy) model comparison criterion. | 2025-08-06 08:33:30 |
| mblt-model-zoo | 0.0.3.1 | A collection of pre-quantized AI models for Mobilint NPUs. | 2025-08-06 05:30:28 |
| verbatim-llm | 0.1.1 | Library to mitigate verbatim or near-verbatim memorization in LLMs. | 2025-08-05 01:16:35 |
| langchain-qualcomm-inference-suite | 1.0.0 | An integration package connecting the Qualcomm AI Inference Suite and LangChain. | 2025-08-04 23:54:22 |
| runlocal-hub | 0.1.7 | Python client for benchmarking and validating ML models on real devices via the RunLocal API. | 2025-08-04 17:40:43 |
| anemoi-inference | 0.7.0 | A package to run inference with data-driven weather forecast models. | 2025-08-04 10:49:09 |
| celux | 0.7.0 | Lightspeed video decoding directly into tensors! | 2025-08-03 17:56:31 |
| yaicli | 0.8.11 | A simple CLI tool to interact with LLMs. | 2025-08-03 11:13:22 |
| llmq | 0.0.2 | High-performance vLLM job queue package. | 2025-08-02 16:10:07 |
| nndeploy | 0.2.2 | Workflow-based, multi-platform AI deployment tool. | 2025-08-01 03:37:02 |
| optimum-rbln | 0.8.2 | Optimum RBLN is the interface between the Hugging Face Transformers and Diffusers libraries and RBLN accelerators. It provides tools for easy model loading and inference on single and multiple RBLN devices across different downstream tasks. | 2025-07-31 06:23:23 |
| azure-ai-projects | 1.0.0 | Microsoft Azure AI Projects client library for Python. | 2025-07-31 02:09:27 |
| optimum | 1.27.0 | Optimum is an extension of the Hugging Face Transformers library, providing a framework to integrate third-party libraries from hardware partners and interface with their specific functionality. | 2025-07-30 16:40:44 |
| llm-execution-time-predictor | 0.1.2 | CLI tool for predicting and profiling LLM batch inference latency. | 2025-07-29 03:16:27 |
| autonomize-model-sdk | 1.1.60 | SDK for creating and managing machine learning pipelines. | 2025-07-28 16:46:10 |
| xgrammar | 0.1.22 | Efficient, flexible, and portable structured generation. | 2025-07-27 23:13:06 |
| aikitx | 1.0.0 | A comprehensive GUI toolkit for large language models (LLMs) with GGUF support, document processing, email automation, and multi-backend inference. | 2025-07-25 19:44:31 |
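
As an illustration of the first entry, here is a minimal sketch of compiling an ordinary PyTorch module with `torch_tensorrt.compile`. The toy model, input shape, and FP16 precision setting are assumptions made for this example, not details taken from the listing above; a CUDA-capable GPU and an installed TensorRT runtime are also assumed.

```python
# Hedged sketch: compiling a small PyTorch model with Torch-TensorRT.
import torch
import torch_tensorrt

# A toy model standing in for any nn.Module (assumption for the example).
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
    torch.nn.Linear(16, 10),
).eval().cuda()

# Compile to a TensorRT-backed module; the input shape and FP16 choice are illustrative.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.float16},  # allow FP16 kernels where TensorRT supports them
)

# The compiled module is called like the original one.
with torch.no_grad():
    out = trt_model(torch.randn(1, 3, 224, 224, device="cuda"))
print(out.shape)
```

Because the returned `trt_model` behaves like the original module, existing inference code can call it without changes.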